AUTOMATIC CREATION OF A CODE GENERATOR
FROM A MACHINE DESCRIPTION* 


Abstract 


This paper studies some of the problems involved in
attaining machine independence for a code generator, similar
to the language independence and the token independence
attained by automatic parsing and automatic lexical systems.
In particular, the paper examines the logic involved in two
areas of code generation: computation and data reference.
It presents models embodying the logic of each area and
demonstrates how the models can be filled out by descriptive
information about a particular machine. The paper
also describes how the models can be incorporated into a
descriptive macro code generating system (DMACS) to be
used as a tool by a language implementer in creating a
machine independent code generator, which can be made
machine-directed by a suitable description of a particular
machine.


*This report reproduces a thesis of the same title submitted
to the Department of Electrical Engineering, Massachusetts
Institute of Technology, in partial fulfillment of the
requirements for the degree of Electrical Engineer, March 1971.


CHAPTER I


1.1 INTRODUCTION


The process of translating a high level language into
machine instructions is traditionally divided into three
distinct problems: lexical analysis, syntactic analysis, and
code generation. The flow of data in such a translator is
outlined in Figure 1.1.


Source         LEXICAL         SYNTACTIC         CODE             Machine
Program  --->  ANALYSIS  --->  ANALYSIS   --->   GENERATION --->  Code


Figure 1.1: Simple Diagram of a Compiler


The lexical analyzer accepts a string of characters and
groups these into identifiers, operators, etc., thus
creating a string of lexical 'tokens'. The parser analyzes
the underlying syntactic structure of this string of
tokens, outputting either a sequence of macro operations or
a parse tree. The code generator then translates the macros
(or the parse tree structure) into machine instructions for
a particular target machine.


Both lexical analysis and syntactic analysis have
been intensively studied. Johnson et al. (4) describe a
system which allows a lexical analyzer to be automatically
created from a series of regular expressions describing
possible input lexical tokens. Similarly, numerous parsing
schemes (1,2,3) have been developed which allow parsers of
varying power to be created automatically from a
context-free BNF description of a language. Very little
work, however, has been done to similarly formalize and
automate code generation. The present research represents
an attempt to isolate some of the problems involved in code
generation and to show how a code generator can be
automatically created from a description of the computer
upon which the code is to be run.


The research does not attack all the problems that
such an automatic code generating system would have to
handle. Rather, it deals with two subproblems corresponding
to two common types of macro, namely:

1. computational macros, such as ADD, MULTIPLY, OR,
etc.;

2. data reference macros, such as subscripting and
structure reference.

In this paper, we examine both types of macro in turn and
develop a model for the logic of each. We then show
how a system can be set up to perform the machine dependent
part of such macro logic from machine descriptive
information.


The two models developed for the operation of the two
types of macro are different. As a result, the paper can be
considered to contain two relatively independent topics:
the first dealing with computational macros, and the second
dealing with data reference macros.


1.2 PREVIOUS WORK 


Although little work has been done to formalize code
generation, a great deal of work has been done on the
related problem of language transferability. One approach
to this problem is that of the 'mobile programming system'
of Orgass and Waite (5,6). In their system, the source
language is translated into a series of simple macros. Then
a user-written set of macro definitions translates the
macros into machine code. The problem of generating code
for a new machine reduces to the problem of recoding the
macro definitions.


A second approach to language transferability is that
of the UNCOL macro language (7,8). UNCOL (UNiversal
Computer Oriented Language) was developed in an attempt to
create a universal macro language into which all high-level
languages could be translated and which itself could be
translated into any machine code. If successful, the UNCOL
system would have solved the problem of language
transferability, since only one translator would ever have
to be written for a language, and only one code generator
for a machine. The Orgass and Waite system differs from the
UNCOL approach in that their macro language was specifically
tailored to their source language. In practice, the
restriction imposed by having only one intermediate language
for all source languages and all machines has proven too
confining for a practical solution.


The two systems just described are similar in that
both attempted to solve the problem of language
transferability by letting the user specify information
about his machine in procedural form. Most of the
information about machine structure is buried implicitly in
the coding of the macros. Such a procedural approach has
been used in all major published work on code generation.
In contrast, the present work uses information about machine
structure given in explicit, descriptive form.


1.3 BRIEF HISTORY OF CODE GENERATION 


Early languages had very few data types. Fortran, for
example, had only two data types. Similarly, early machines
tended to have a small number of special-purpose registers.
For such language-machine pairs, the process of generating
code tended to be straightforward. A macro generally
consisted of a short, independent section of logic which
performed a few simple tests and then output code. Thus a
very simple procedural language could let the user define
these macros (12).

With the introduction of more complicated machines
and of languages with more data types, some of which (such
as bit-strings) may be more complicated, code generation has
become a harder task (9,13). Separate modules have become
desirable to handle register manipulation and to handle
data-dependent logic for the various data types. Such a
modular approach allows a macro to be written fairly
compactly, calling these modules as subroutines to locate
free registers and to return usable representations of
operands (such as a displacement and registers containing a
base and an index).


In a traditional macro system, all of these modules
and macros must be written by the user using a procedural
language provided for the purpose. Due to the complexity of
modern languages and machines, such a macro language can no
longer be a very simple one. Similarly, the job of writing
a code generator is much more complex.


1.4 DMACS: A DESCRIPTIVE MACRO SYSTEM 


This paper describes an automatic code generating
system named DMACS. There are two steps in creating a
code generator using DMACS. The first step is to define a
set of procedural macros in a machine independent, somewhat
skeletal form. The second step is to supply information
describing the computer for which code is to be generated.
DMACS uses this information to flesh out the macro
definitions. The two steps are quite independent, so that
once the first step is done for a language, the second step
can then be done for a variety of object machines.
Similarly, once a machine has been described, implementing a
second language requires little change to the machine
description.


The first step can be thought of as defining the
semantics of the language using machine independent
primitives. The second step can be thought of as defining
the structure of the target machine. Examples of the two
steps are discussed in Chapters 3 and 4. To facilitate
these two steps, DMACS provides two languages:

1. MIML - a procedural machine independent macro
language, and

2. OMML - a declarative object machine macro language.

Programs written in the two languages are bound together by
the DMACS system.


Figure 1.2 outlines how the DMACS system is used. As
can be seen, the traditional compile-time vs. run-time
distinction has proliferated into four separate 'times' in
viewing DMACS as a whole.

1. Macro definition time - when a language implementer
presents his machine independent macros to DMACS.

2. Machine description time - when a machine specifier
inputs a description of his machine to fill out the
machine independent macros.

3. Language compilation time - when a programmer inputs
his source program to the compiler as a whole.

4. Program execution time - when that compiled program is
actually executed.


1. Macro Definition Time (language implementer):
      MIML program ---> [DMACS]

2. Machine Description Time (machine specifier)

3. Language Compilation Time (programmer):
      Source program ---> [Machine Oriented Code Generator]

4. Program Execution Time:
      Input ---> [Code] ---> Output


Figure 1.2: Using DMACS: 4 Users and 4 'Times'


1.5 OVERVIEW 


The present research develops models of two types of
macros: computation and data reference macros. At the same
time, the paper illustrates how these models can be built
into DMACS as tools. These tools can be used by a language
implementer to create machine independent macros defining
the semantics of his language which can be filled out from a
machine description.


Chapter 2 gives the reader an overall introduction to
code generation and to the DMACS system. It also discusses
some of the restrictions as to possible machine structure
which are assumed in the following chapters.


Chapter 3 presents a model of the logic of
computational macros. The model pictures a code generator
as a state machine whose state is determined by the location
of the values used in generating code. In the model, each
computational macro has 'permitted' states for its operands,
from which code can be emitted. For the IBM-360, for
instance, the permitted states for integer addition would
allow both operands in registers or one operand in a
register and the other in a word of core memory. To
generate code for such a macro, the code generator must make
a transition into a permitted state and then emit an
appropriate instruction sequence from that state.


Using a procedural macro system, the user specifies
how such state transitions are to be made. In a descriptive
system such as DMACS, the transitions must be performed
automatically from a description of the register and memory
structure of a machine, and of the paths (load, store,
register-register transfers) between core memory and
registers.


Chapter 4 turns to the problem of achieving the same
machine independence for data reference macros. To achieve
this goal, a data definition facility is built into DMACS.
The language implementer writes his data reference logic in
terms of the primitives of the facility. A machine
specifier then describes his machine memory and how source
data items are mapped into that memory. DMACS can then
characterize these source data items in terms of the
primitives of the data definition facility. As a result,
the macro logic is able to operate upon them.


In summary, the research is a step towards creating
models of two aspects of the code generation process, and
towards abstracting code generation from any particular
machine. In this paper we show how these models can be
implemented as tools to be used by a language implementer to
create a machine independent code generator which can be
filled out from a machine description. Furthermore, it is
seen that this approach to code generation, as a natural
by-product, leads to a clean separation of the semantics of
a source language from the structure of a particular target
machine, a separation which is often hard to isolate in a
compiler with a code generator oriented towards a particular
machine.


CHAPTER II 


A DESCRIPTION OF A CODE GENERATOR 
2.1 INTRODUCTION TO CODE GENERATION 


Code generation is the last major task in the
translation of a high-level language into machine language.
A code generator receives its input from the syntactic
analyzer (the parser). Although in some compilers the input
is in the form of a parse tree, in this paper it is assumed
that the input is in the form of a linear sequence of macro
operations.


A = B + C * D;


     =
    / \
   A   +                 1  MUL   C,D
      / \                2  ADD   1,B
     B   *               3  ASSG  A,2
        / \
       C   D

   Parse Tree            Macros


This assumption is not a restriction, however, since a parse
tree can readily be converted into such a sequence of
macros. The task of the code generator is to convert the
macros into machine instructions.
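The conversion from a parse tree to a macro sequence can be
sketched as a post-order walk of the tree. The Python
fragment below is only an illustration and is not part of
DMACS: the tuple encoding of the tree, the name `to_macros`,
and the operator-to-macro table are assumptions made here.
Operands are listed in tree order (left, right), which may
differ from the operand order a particular parser would
choose.

```python
from typing import List, Tuple, Union

# A tree node is either a variable name (leaf) or (operator, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]
# An operand is a variable name or the line number of an earlier macro.
Operand = Union[str, int]
Macro = Tuple[int, str, Operand, Operand]

# Hypothetical mapping from source operators to macro names.
MACRO_NAMES = {"=": "ASSG", "+": "ADD", "*": "MUL"}


def to_macros(tree: Tree) -> List[Macro]:
    """Flatten a parse tree into numbered macro lines by a post-order walk."""
    macros: List[Macro] = []

    def walk(node: Tree) -> Operand:
        if isinstance(node, str):      # leaf: refer to the variable itself
            return node
        op, left, right = node
        lhs = walk(left)               # macros for the operands come first
        rhs = walk(right)
        line = len(macros) + 1         # then the macro for this operation
        macros.append((line, MACRO_NAMES[op], lhs, rhs))
        return line                    # later macros refer to this line number

    walk(tree)
    return macros


# A = B + C * D
example = to_macros(("=", "A", ("+", "B", ("*", "C", "D"))))
```

Each inner node yields one macro line, and a reference to an
inner node becomes a reference to the line that computed it.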
In a compiler for a complex language with many data
types, the code generator is often allowed direct access to
the symbol table constructed by the parser. The information
in the symbol table can then be used directly to generate
the correct code to access the different data items. The
data flow in such a system is illustrated below.


Source
Program ---> PARSER ---> Macros ---+
               |                   |
               v                   v
             Symbol  --------->  CODE       --->  Machine
             Table               GENERATOR        Code


The parser converts the source program into macros, while
simultaneously building the symbol table. The code
generator then accepts both the macros and the symbol table
as input for generating machine instructions.


A macro line consists of a line number, an
operation, and that operation's operands, e.g. 1 ADD X,Y. In
an actual compiler, the line number is usually implicit, and
the operation and the operands can be thought of as
pointers. The operation is a pointer into a table of macro
definitions. The operands are either pointers to the symbol
table entries describing the values to be operated upon, or
pointers to previous macro lines indicating the results of
previous macro operations.


The paper discusses two particular kinds of macros:
computational macros and data reference macros. The
following example illustrates both types of macros.


A(I) = B + C(J) * D


i     SS    C,J
i+1   MUL   i,D
i+2   ADD   i+1,B
i+3   SS    A,I
i+4   ASSG  i+3,i+2


In this example, SS (subscript) is a data reference macro,
and MUL and ADD are computational macros.


As an example of computational macro logic, consider
integer addition on the IBM 360. The 360 has two Add
instructions for integers: 'A', which adds a word of memory
to a register, and 'AR', which adds two registers. In
generating code for an ADD macro, the code generator must
check the location of the values to be added to see if
either of the instructions can be emitted directly. If not,
the code generator must emit instructions to load one (or
both) into registers. If, in the process of finding a
register to load into, the code generator must cause the
previous contents of a register to be stored, the new
location of the stored value must be recorded. Furthermore,
if one of the values to be added is not directly accessible
(e.g. a bit string value), the code generator must emit load
and shift instructions to isolate that value in a register.
Finally, after emitting the appropriate add instruction, the
code generator must record the location of the macro's
result.
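The core of this logic can be sketched procedurally. The
Python fragment below is a simplified illustration, not the
DMACS mechanism: the `Value` record and `expand_add` are
hypothetical names, a free register is assumed to be
available, and the bit-string and register-spilling cases
described above are left out. It also overwrites the
register holding the first operand, which a real code
generator could do only if that value is not needed again.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Value:
    kind: str    # "reg" (in a register) or "mem" (a word of core memory)
    where: str   # register name or symbolic address


def expand_add(x: Value, y: Value, free_regs: List[str], code: List[str]) -> Value:
    """Emit 360-style code for an integer ADD and return the result location."""
    if x.kind == "mem" and y.kind == "reg":
        x, y = y, x                              # addition commutes: register first
    if x.kind == "mem":                          # neither operand is in a register
        reg = free_regs.pop(0)                   # assumes a free register exists
        code.append(f"L {reg},{x.where}")        # load the first operand
        x = Value("reg", reg)
    if y.kind == "reg":
        code.append(f"AR {x.where},{y.where}")   # register-register add
    else:
        code.append(f"A {x.where},{y.where}")    # storage-to-register add
    return x                                     # the sum is left in x's register
```

For two operands in core, the sketch emits a load followed
by an 'A' instruction and reports the loaded register as the
location of the result.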


Similar examples of data reference logic are given in
Chapter 4.


2.2 INTERNAL TABLES 


The symbol table contains information about all the
values (variables) declared by the programmer. At some
point before code generation, core locations must be
allocated for these variables. The core location
information can be stored in the symbol table entry for each
item. Exactly how core allocation might be done is
discussed in Chapter 4. In addition to the values declared
by the programmer, the code generator must also record the
location of values which have been computed by previous
macro lines, but not yet used. In most machines, a
computation leaves its result in some register. Since the
result can often later be used unmoved, it is desirable to
leave it in the register if possible. If, however, an
intervening macro requires that register for its
computations, it is necessary to store its contents in a
'temporary' in core and to remember that this has been
done.


To keep track of the location of such previous macro
results, three tables are built into the code generator: a
macro result table (MRT), a register state table (RST), and
a temporary table (TT).


MRT: The macro result table records the location of a
macro's result(s), if any. The MRT has one entry for
each macro line. Each value recorded in the entry
consists of a pointer to the register or temporary where
the value is located.


RST: The register state table contains one entry for
each register. Each entry indicates whether that
register contains a computed value or is free.
Each entry recording a computed value contains a pointer
to the MRT record representing that value. Thus, when a
register must be stored, the MRT entry can be easily
changed to point to the temporary location where the
value is to be stored. Each RST entry also contains
fields which are used to flag a register with
information to be used in selecting a register to be
stored.


TT: A temporary table can be implemented in various
ways. For the purpose of this discussion, any
implementation is acceptable. One strategy is to
allocate a new temporary each time one is needed, in
which case all that need be remembered outside the MRT
is the number of the last temporary allocated. A more
efficient strategy is to reuse temporaries after the
results they hold are used, in which case the TT must
have an entry for each temporary allocated.


2.3 THE 'GETREG' ROUTINE


The internal tables described in the preceding
section allow computed values to be left in the registers
where they are computed. If such tables are not used, every
computed value must be immediately stored in a temporary,
which is clearly undesirable. If values are to be left in
registers, however, a routine must be provided which locates
free registers available for use. The paper refers to that
routine as the GETREG routine.


The GETREG routine is passed the name of a register
class as an argument. It cycles through that class looking
for a free register. If none is found, the routine picks
one of the registers and stores its current contents in a
temporary, updating the MRT entry pointing to that value.
The priorities used in selecting which register to store, if
there is a choice, are discussed in Chapter 3.
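The interplay of GETREG with the three tables can be
sketched as follows. This Python fragment is an illustration
under stated assumptions, not the DMACS implementation: the
`Tables` record and its field names are hypothetical, the TT
is reduced to the simple allocate-a-new-temporary strategy,
the store mnemonic 'ST' is borrowed from the 360, and the
victim is chosen by a placeholder policy (the first register
of the class) in place of the priorities of Chapter 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Location = Tuple[str, str]   # ("reg", name) or ("temp", name)


@dataclass
class Tables:
    classes: Dict[str, List[str]]                                # class -> registers
    mrt: Dict[int, Location] = field(default_factory=dict)       # macro line -> location
    rst: Dict[str, Optional[int]] = field(default_factory=dict)  # register -> macro line
    temps_allocated: int = 0                                     # simplest possible TT
    code: List[str] = field(default_factory=list)                # emitted instructions


def getreg(t: Tables, reg_class: str) -> str:
    """Return a free register of the class, storing one if none is free."""
    regs = t.classes[reg_class]
    for r in regs:
        if t.rst.get(r) is None:       # no computed value recorded: free
            return r
    victim = regs[0]                   # placeholder selection policy
    line = t.rst[victim]               # macro line whose result is displaced
    t.temps_allocated += 1
    temp = f"T{t.temps_allocated}"
    t.code.append(f"ST {victim},{temp}")
    t.mrt[line] = ("temp", temp)       # the value now lives in the temporary
    t.rst[victim] = None               # the register is free again
    return victim
```

Because each RST entry points back to the MRT entry of the
value it holds, storing a register needs only one table
update to keep later macro operands correct.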


2.4 SOME QUESTIONS TO BE ANSWERED


The previous sections give a brief introduction to
code generation in general. The remainder of the chapter
attempts to use the introduction as a framework within which
to outline exactly what aspects of code generation are to be
dealt with in Chapters 3 and 4. Among the questions to be
clarified are these:


1. What different types of machine structure do the models
presented deal with? Clearly there are many different types
of machines, ranging from machines like the 7090 with
special purpose registers, to machines like the PDP-10 with
general purpose registers, to stack machines, and to
microprogrammed machines capable of complicated runtime
checks. Similarly, machines have differing addressing
mechanisms: byte-addressing, word-addressing, indexed or
unindexed, based or not based, directly addressable or paged
addressable (as in many small machines), etc. The models
presented are not capable of handling all possible machine
structures.


2. What kind of values do the models presented deal with?
Possible values in a computer are integers of different
precision, booleans, bitstrings, floating point numbers of
different precision, decimal numbers, character strings,
addresses, etc. The present research is not concerned with
all of these possible types of values.


3. How are values allowed to map into the machine
structure? For instance, are bitstring values to be allowed
to cross word boundaries? How are different values assumed
to be accessed?


4. What is meant by 'machine description'? Intuitively, one
might expect machine description to entail somehow listing
registers, core memory units and opcodes. On the other
hand, might not a low-level code sequence, which
accomplishes some primitive function such as subtraction or
loading a value, be considered to be a reasonable part of a
'machine description'? This question is discussed in
Section 2.9.


2.5 ASSUMPTIONS ABOUT MACHINE STRUCTURE


The present research makes several simplifying
assumptions about the structure of possible target
machines. The assumptions are spelled out in more detail in
Chapters 3 and 4.


Registers: The machine is assumed to have a set of registers
for manipulating values. These may be either special
purpose or general purpose registers. The machine specifier
describes the registers by naming them, grouping them into
classes, and defining how they are used in manipulating
data. Chapter 3 describes more precisely how this is done.


Core Memory: The whole of core memory is assumed to be
directly addressable (as opposed to the paged addressability
found on some small machines). It is assumed that the
addressing is done in a machine instruction by either a
displacement and an index, or by a displacement, an index,
and a base. The machine specifier must indicate which
registers may be used as indices and bases. In generating
an address, DMACS creates an internal 'generated address'
consisting of a displacement, index, and base (the index or
base may be nil). If both index and base are present in a
generated address, however, and the particular target
machine allows only an index, then DMACS generates code to
add the base and index together, thus transforming the
'generated address' into a 'machine address' for that target
machine.
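The transformation of a generated address into a machine
address can be sketched as below. This is an illustrative
Python fragment with hypothetical names (`GeneratedAddress`,
`to_machine_address`) and a made-up 'ADD' mnemonic. Two
details are assumptions made here and are not stated in the
text: a lone base is simply moved into the index position,
and the sum is formed in the index register itself, whereas
a real code generator might need a scratch register.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GeneratedAddress:
    displacement: int
    index: Optional[str] = None   # register name, or None for 'nil'
    base: Optional[str] = None    # register name, or None for 'nil'


def to_machine_address(addr: GeneratedAddress, machine_has_base: bool,
                       code: List[str]) -> GeneratedAddress:
    """Fold the base into the index when the target allows only an index."""
    if machine_has_base or addr.base is None:
        return addr                                # already a machine address
    if addr.index is None:
        # Assumption: a lone base can be used in the index position.
        return GeneratedAddress(addr.displacement, index=addr.base)
    code.append(f"ADD {addr.index},{addr.base}")   # index := index + base
    return GeneratedAddress(addr.displacement, index=addr.index)
```

On a machine with both base and index fields the generated
address passes through unchanged and no code is emitted.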


2.6 ASSUMPTIONS ABOUT VALUES 


In a complex real-world compiler, many types of
values can be used as operands. Elson and Rake (9) discuss
some of the involved problems of writing macro definitions
for a complicated language (PL/I). The present work does
not attempt to handle the complexity of such a language;
rather, it makes certain simplifying assumptions as to the
types of values to be allowed as operands. The restrictions
allow a reasonably simple model of code generation to be
constructed which exposes some of the basic conceptual
processes and problems involved, without becoming bogged
down in a huge ad-hoc mess.


The model of a code generator presented in this paper
is set up to handle values which, intuitively, are of the
integer (or integer bit-string) and floating point variety;
values which are manipulated via registers and thus are no
larger than the registers used on the particular target
machine. Character-string and decimal values are not
considered.


2.7 HOW VALUES ARE REPRESENTED ON THE MACHINE


There are three general classes of locations for
values on a machine: a value can be in a register, it can
be simply accessible in core, or it can be in core but not
simply accessible. A value is simply accessible if its
address can be put directly into a computational machine
instruction, such as an Add instruction. (Thus a value may be
addressable in a special load instruction yet not simply
accessible.) For instance, a byte on the IBM-360, even
though addressable, is not simply accessible for
computation. It must first be isolated in a register.


Let us examine how a value might fall into each of
these classes.


Registers: The only values which may be in a register (at
the start of a macro expansion) are values computed by
previous macro lines.


Simply Accessible: Simply accessible values include both
results of previous macro lines which have been stored in
temporaries (which are assumed to be simply accessible
locations), and values declared in the source program which
have been mapped into simply accessible core memory units.
Chapter 4 explains exactly how this mapping is done.


Not Simply Accessible: This class is composed of values
which cannot be directly operated upon by computational
instructions. They must first be isolated in a register
before they can be used. Such values include individual
bits, and bit-strings which are not on wholly accessible
boundaries.


2.8 LOAD/UPDATE ROUTINES 


The fact that not all values are simply accessible
gives rise to the concept of a load/update pair: a pair of
routines to access and to update a value. The idea of
characterizing a data item by a pair of load/update routines
was first formulated by Strachey (11). A simple example of
such an inaccessible data item is a bit-string within a
word. Its location might be represented by an address
(perhaps indexed and based) and a bit displacement within
the addressed memory unit. Its load/update pair might
consist of two routines which take the 'location' and
generate code as follows:


1. Load Routine:
a. load the memory unit (i.e., word) into a register
b. shift left to eliminate high-order bits
c. shift right, eliminating low-order bits and
right-adjusting the value in the register


2. Update Routine:
a. shift the new value to the correct target position
b. load the target word into a register
c. use a bit mask to zero out the target byte
d. OR the two words together
e. store the result
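The effect of the two code sequences above can be simulated
directly. The Python sketch below is an illustration only:
it computes what the emitted instructions would compute
rather than generating them, and it assumes a 32-bit word,
a field that lies within one word, and bit positions counted
from the high-order end. The names `load_field` and
`update_field` are hypothetical.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1


def load_field(word: int, start: int, length: int) -> int:
    """Load routine: isolate `length` bits beginning `start` bits from the top."""
    reg = word & MASK                       # a. load the word into a register
    reg = (reg << start) & MASK             # b. shift left, dropping high-order bits
    return reg >> (WORD_BITS - length)      # c. shift right, right-adjusting the value


def update_field(word: int, start: int, length: int, value: int) -> int:
    """Update routine: return the word with the field replaced by `value`."""
    shift = WORD_BITS - start - length
    field_mask = ((1 << length) - 1) << shift
    new = (value << shift) & field_mask     # a. shift the new value into position
    reg = word & MASK                       # b. load the target word
    reg &= ~field_mask & MASK               # c. zero out the target field
    reg |= new                              # d. OR the two words together
    return reg                              # e. the caller stores the result
```

For example, with the word 0x12345678, loading the 8-bit
field that starts 8 bits from the top yields 0x34, and
updating that field with 0xAB yields 0x12AB5678.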


In practice such a value has two kinds of 'location'
and correspondingly two load/update pairs: one for when the
location of the string within the word is known at compile
time, and one for when it is computed at run time. The
routines are further complicated if a value extends across a
word boundary.


The load/update problem arises from the fact that
programmers are interested in values that do not map
directly into accessible units. Generally only an address
can be put into a machine instruction. If a computational
machine instruction could accept an address, starting bit,
and bit length, then the complexity of the load/update
routines would disappear. An alternate approach might be to
have special hardware load and store instructions to access
bits of a word. This would retain the load/update
framework, but the routines would consist of only one
instruction.


2.9 MACHINE DESCRIPTION 


Using DMACS, a machine specifier can implement a
language by describing various features of his machine. In
the next two chapters, such a description is examined in
more detail.


Parts of the 'description' consist of listing names
of registers and of core memory units and of describing how
these relate to one another. Another part of this
description, however, involves writing short low-level code


CHAPTER III


A CODE GENERATOR AS A STATE MACHINE 


3.1 OVERVIEW 

3.1.1 THE STATE MACHINE 

Chapter 3 presents a model of the logic of a
computational macro. This model pictures a code generator
as a state machine whose state is determined by the location
of the values used to generate code. The location of a
value may be an accessible core location, a non-accessible
core location, or a register. In the model, each
computational macro has one or more permitted states for its
operands from which code can be emitted. To generate code
for a macro, the code generator must make a transition into
one of the permitted states and emit a particular code
sequence from that state.


In a procedural macro definition language, the user
explicitly specifies these transitions himself. In a
descriptive system such as DMACS, logic to perform
transitions is deduced automatically from machine-descriptive
information. The chapter shows how such an
automatic mechanism is built into DMACS to perform
transitions given a machine description describing register
structure, permitted states for computation, and code
sequences which perform these computations. Not
surprisingly, the automatic mechanism makes certain
restricting assumptions as to object machine structure.
Thus, the model is a somewhat restricted one presented to
isolate the basic ideas involved, and to provide a basis
upon which a more general system can be built.


3.1.2 THE STATE OF THE MACHINE 


In this chapter, the term 'state' is used in two
contexts: the 'state' of the code generator as a whole, and
an input, output, or permitted 'state' of an individual
macro.

1. The state of the code generator is determined by the
locations of all the values which are to be used as
operands to any macro.

2. The input state of a macro is determined by the
location of the values passed to it as operands.

3. A permitted state of a macro is a particular
configuration of operand locations from which code can
be emitted.

4. An output state of a macro is determined by the
location of the result of the computation.


3.2 A SIMPLE EXAMPLE


The following simplified example illustrates how the
state machine concept is used. The example concentrates on
integer addition for the IBM-360.


1. Input States: For simplicity, let us restrict operands to
two locations:

1. registers of class 'R' (abbreviated 'R')

2. accessible storage (abbreviated 's')

Thus input states for two operands can be described by the
following pairs: (s,s), (s,R), (R,s), or (R,R).


2. Permitted States: The IBM-360 has two instructions which
perform integer addition. Permitted states are (R,s),
(s,R), and (R,R). From (R,s) and (s,R) a
storage-to-register Add instruction, 'A', is emitted. From
(R,R) a register-register Add instruction, 'AR', is
emitted.
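The relation between input states and permitted states can
be made concrete with a small sketch. The Python fragment
below is illustrative only: choosing the permitted state
that needs the fewest loads is one plausible selection
policy assumed here, not necessarily the one DMACS uses, and
the names `loads_needed` and `choose_state` are hypothetical.

```python
# Permitted operand states for integer addition and the opcode emitted from each.
PERMITTED = {("R", "s"): "A", ("s", "R"): "A", ("R", "R"): "AR"}


def loads_needed(src, dst):
    """Loads required to move from state src to state dst, or None if unreachable."""
    count = 0
    for have, want in zip(src, dst):
        if have == want:
            continue
        if have == "s" and want == "R":
            count += 1           # one load instruction moves s -> R
        else:
            return None          # R -> s would need a store; not considered here
    return count


def choose_state(input_state):
    """Pick the permitted state reachable from input_state with the fewest loads."""
    candidates = []
    for state in PERMITTED:
        cost = loads_needed(input_state, state)
        if cost is not None:
            candidates.append((cost, state))
    return min(candidates)[1]    # ties are broken by the ordering of the state tuple
```

Of the four input states, only (s,s) is not itself
permitted; a single load takes it to (R,s) or (s,R), from
which 'A' can be emitted.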


3. A Machine Independent Macro: If the source language
allowed both integer and floating point operands, the
language implementer might write a machine independent ADD
macro with logic as follows:


are discussed in Chapter 4. A more detailed description of
the 360's register structure is found in Section 3.7.


Next, the machine specifier defines integer addition.

IADD a1,a2 (commutative)
   from R(a1),R(a2) emit AR a1,a2 result R(a1)
   from R(a1),S(a2) emit A a1,a2 result R(a1)

This description defines two permitted states, code to be
emitted from each state, and the location of the macro
result. In the first state, both operands are in registers.
From this state, an 'AR' instruction is to be emitted. The
result is to be recorded in the register containing the
first operand. The declarations are used to fill out the
MIML macro. The attribute 'commutative' indicates that
addition is commutative, and thus R(a2),S(a1) will be
included as a permitted state without being declared
explicitly.


Notice that the declarations are essentially a 


description of 1!8M-360 ftnteger addition, 


5. Advantages: Because the state machine model is built
into DMACS, both the language implementer and the machine
specifier find their tasks lightened. The language
implementer can write a very simple source macro without
worrying about machine structure. He need not perform tests
to ascertain the state of the operands, nor transform the
input operand state in any way. The machine specifier, in
turn, is able to implement the macros by describing his
machine without worrying about the constructs of the source
language or the internals of the compiler.


6. The Role Of DMACS: The machine specifier defines his
register structure, the permitted states filling out each
'primitive' (such as IADD) in the machine independent
macros, the code sequences to be emitted from each permitted
state, and his data pathways including load and store
instructions. From this descriptive information, DMACS must
deduce three things: how to select a target permitted state
for a given input state, how to reach that state, and how to
obtain a free register of a given class when, in the process
of making a transition, it needs to load a value.

The remainder of this chapter deals with these topics
in more detail and discusses the problems involved.
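The example above can be pictured as a small table lookup. The following is a minimal sketch in Python, not the DMACS implementation: the names IADD_RULES, fill_out, and emit are hypothetical and are introduced here only for illustration. It assumes that a commutative macro is filled out by adding each declared state with its operands exchanged.

```python
from itertools import product

# Declared permitted states for IADD: operand locations -> opcode.
# 'R' is a register, 's' is accessible storage (as in the text).
IADD_RULES = {
    ("R", "R"): "AR",
    ("R", "s"): "A",
}

def fill_out(rules, commutative):
    """Return {state: (opcode, operand order)}; add swapped states if commutative."""
    permitted = {state: (opcode, (0, 1)) for state, opcode in rules.items()}
    if commutative:
        for (loc1, loc2), opcode in rules.items():
            # The swapped state emits the same opcode with its operands exchanged.
            permitted.setdefault((loc2, loc1), (opcode, (1, 0)))
    return permitted

def emit(permitted, state, operands):
    """Emit the instruction text for a permitted state."""
    opcode, order = permitted[state]
    return f"{opcode} {operands[order[0]]},{operands[order[1]]}"

INPUT_STATES = list(product("sR", repeat=2))   # (s,s), (s,R), (R,s), (R,R)
PERMITTED = fill_out(IADD_RULES, commutative=True)
```

With this table, (s,s) is the only input state that is not permitted, so it is the only one that requires a transition before code can be emitted.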
3.3 MACHINE STRUCTURE


3.3.1 REGISTERS 


The code generator must be able to manipulate values
in and out of registers to attain permitted states. In
trying to incorporate automatic register handling logic into
DMACS, there are two conflicting goals. First, the user must
be able to describe his registers flexibly enough to include
a reasonably large class of machines. Second, there must be
enough restrictions so that the logic which attains
permitted states can be generated from this description
automatically. These two goals conflict since the more
flexible the model is, the harder it is to incorporate into
an automatic system. The assumptions as to register
structure outlined in this section are restrictive, but
provide a base for later extension of the model.


In attaining permitted states the system must be able
to find a free register of a given class, to load and store
the contents of any register, and to transfer a value from
one register to another. To allow this, the machine
specifier defines the following:


1. The Machine Registers: R = (r1, r2, r3, ... rn)

2. Classes of Registers: (R1, R2, ... Rn); Ri ⊂ R
The classes are defined so that every register is in at
least one class, if only by itself, and so that any two
classes are either subsets, equal, or disjoint. There
is no partial overlap.

3. Pathways to Core: Each class of registers is assumed
to have a direct path to and from core. There is no
need to go through a second register in either loading
or storing. This is a simplifying assumption which
might be relaxed in a more powerful extension of the
model. (A stack machine, for instance, does not conform
to this assumption.) The machine specifier must define
the load and store instructions used in these pathways.

4. Paths between Registers: The machine specifier must
define any available register to register transfers.

5. Relationships between Registers: The machine
specifier may define relationships between registers.
These can be used for such register-register
relationships as even-odd pairs. He may also specify
that in certain conditions the use of one register
implies that a related register must be made available
as well.

In this fashion, the user describes his register
structure. Section 3.3.3 describes how this information is
used by DMACS to construct the GETREG routine to obtain a
free register of a given class.

3.3.2 SAMPLE REGISTER DESCRIPTION: IBM-360


rclass REG:r2,r3,r4,r5,r6,r7,r8,r9,r10,r11
rclass ODDREG:r3,r5,r7,r9,r11

relation EPAIR (stored: ODDREG)
     r3:r2
     r5:r4
     r7:r6
     r9:r8
     r11:r10

rpath WORD->REG: L REG,WORD
rpath REG->WORD: ST REG,WORD
rpath REG->ODDREG: LR ODDREG,REG

These declarations define two register classes. For
each member of the class ODDREG, a related EPAIR register is
declared. The attribute (stored: ODDREG) means that when an
ODDREG register is called for, its related EPAIR register
must be made available as well.
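The restrictions on register classes can be checked mechanically. The sketch below, in Python, encodes the sample description as plain sets and tests the "subset, equal, or disjoint" rule of Section 3.3.1. The function names classes_are_nested and registers_required are hypothetical; the reading of (stored: ODDREG) as "also free the EPAIR partner" follows the paragraph above.

```python
REG = {f"r{n}" for n in range(2, 12)}          # r2 .. r11
ODDREG = {"r3", "r5", "r7", "r9", "r11"}
EPAIR = {"r3": "r2", "r5": "r4", "r7": "r6", "r9": "r8", "r11": "r10"}
CLASSES = {"REG": REG, "ODDREG": ODDREG}

def classes_are_nested(classes):
    """True if any two classes are nested, equal, or disjoint (no partial overlap)."""
    sets = list(classes.values())
    for i, a in enumerate(sets):
        for b in sets[i + 1:]:
            if a & b and not (a <= b or b <= a):
                return False
    return True

def registers_required(register, class_name):
    """Registers that must be free when `register` is obtained as `class_name`."""
    if class_name == "ODDREG":
        # (stored: ODDREG): the related EPAIR register must be made available too.
        return [register, EPAIR[register]]
    return [register]
```

A description in which two classes shared only some of their registers would fail the check and fall outside the model assumed in this chapter.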
3.3.3 THE 'GETREG' ROUTINE

The GETREG routine is called by DMACS when, in
performing a transition, a free register of a given class is
needed. The routine must be adjusted by DMACS using the
machine specifier's description of his register structure,
so that it operates correctly for the particular machine
registers and register classes involved.

The GETREG routine cycles through the register class
it receives as an argument, attempting to find an empty
register. If none is empty, the routine must choose a
register to store based on the 'flags' attached to the
registers. The flags are used to protect values in
registers so that they will not be stored unless necessary.
The mechanism for flagging is complicated by the fact that a
macro can be called as a subroutine during a given macro
expansion. The priorities used in protecting registers are
discussed in more detail in Section 3.5. The net effect of
the priorities is that the most recently set registers are
stored last. The GETREG routine also must handle situations
when related registers must be freed at the same time.




3.4 THE AUTOMATIC TRANSITION 


3.4.1 INTRODUCTION TO THE TRANSITION

Performing a transition involves transforming any
possible input state of a macro into one of that macro's
permitted states. The input state of a macro is determined
by the location of the values passed to it as operands.
These operands can be classed as follows:

1. s - in an accessible storage location in core memory

2. Ri - in a register of register class Ri

3. Rjf - in a non-accessible core location, requiring a
code generating load function which will isolate the
value in a register of class Rj. (The concept of having
to apply a function to an operand could apply to mode
conversions as well as to loading non-accessible
values. Thus, although this paper deals only with Rjf
values in a limited context, the concept involved is a
more general one.)

For the sake of simplicity, this section deals only
with two-operand macros (such as ADD X,Y). For a
two-operand macro, input states are taken from

     (s U Ri U Rif) X (s U Ri U Rif)

Permitted states are taken from

     (s U Ri) X (s U Ri)

On a machine that has one class of registers (say R),
input states are (s,s), (s,R), (s,Rf), etc. Permitted
states for an arbitrary macro might be (s,R) and (R,R).
Thus a reasonable transition to make from input state (R,s)
is to permitted state (R,R).


Performing the transition involves choosing a target
permitted state to aim for and a path to reach that state.
In the remainder of the chapter, we assume that the task of
performing a transition can be seen as two distinct
problems:

1. Selection of a target permitted state for each input
state, based on the cost of transforming each operand
location.

2. Given an input state and its target state,
determining in what sequence changes are to be made.

These two steps are closely related. On some machines, the
two steps cannot be performed independently. For instance,
on a stack machine one cannot consider the cost of
transforming the location of each operand individually
without considering the sequence of transformations. On a
machine that conforms to the assumptions that we have made
about register structure, however, the separation of these
two steps is possible.


3.4.2 SELECTING A TARGET STATE 


The selection of a target state for each possible
input state of a macro is the first step in setting up the
automatic transition. If there is only one permitted state,
this selection is trivial. Otherwise a target for each
input state must be selected from among several permitted
states. Clearly, some criterion is needed for measuring the
cost of changing states. For each input state, the permitted
state yielding the lowest such cost can then be selected as
a target. The cost criterion used in this section is the
number of instructions required (not counting inadvertent
storing of values, since these are not predictable in
advance). Since we assumed in Section 2.3 that each
register has a direct path to and from core, the maximum
cost of changing the location of one operand is 2 (storing
the value from one register, and then loading it into a
second). Rif values, which require a function to load them
into a register, are treated as if they were already in that
register since the function is to be applied in all cases
and is thus a constant cost.


Example: Assuming two register classes, R and R', and a
register-register transfer, sample costs are:

     (input state)  (permitted state)  (cost)  (comment)

     (s,s)          (s,R)              1       load
     (s,R)          (s,R)              0
     (s,R)          (R,s)              2       store, load
     (s,R)          (s,R')             1       transfer
     (s,Rf)         (R,R)              1       load
     (Rf,Rf)        (R,R)              0

Figure 3.1: Sample Transition Costs


An alternate cost criterion might be instruction
execution times. In either case, the target selection
algorithm simply selects for each input state that permitted
state which can be reached at lowest cost. The selection of
a target state for each input state need not be performed
every time the macro is called. It can be compiled into a
table at machine description time.
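The selection just described can be sketched as a short Python routine. This is an illustration under stated assumptions rather than the DMACS code: locations are written as the strings "s", "R", "R'", and "Rf"; the set of available register-register transfers is passed in explicitly; and the function names are hypothetical. The per-operand costs follow the rules given above (0 if unchanged, 1 for a load, store, or transfer, 2 for a store followed by a load).

```python
def operand_cost(src, dst, transfers=frozenset()):
    """Instructions needed to move one operand from location src to dst."""
    if src.endswith("f"):
        src = src[:-1]            # an Rf value is costed as if already in R
    if src == dst:
        return 0
    if src == "s" or dst == "s":
        return 1                  # one load or one store
    # A register of one class to a register of a different class.
    return 1 if (src, dst) in transfers else 2   # transfer, or store then load

def state_cost(state, permitted, transfers=frozenset()):
    """Total cost of changing every operand of `state` to match `permitted`."""
    return sum(operand_cost(s, d, transfers) for s, d in zip(state, permitted))

def select_target(state, permitted_states, transfers=frozenset()):
    """Pick the permitted state that can be reached at lowest cost."""
    target, mincost = None, float("inf")
    for candidate in permitted_states:
        totalcost = state_cost(state, candidate, transfers)
        if totalcost < mincost:
            target, mincost = candidate, totalcost
    return target, mincost
```

Calling select_target once for every input state of a macro yields the table that, as noted above, can be built at machine description time.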


Algorithm to select a target permitted state
for an input state of a macro

     input state:       i  = (i1,i2)
     permitted states:  pj = (pj1,pj2),  j = 1,n

     [Flowchart not reproduced. Legible steps: set target = nil
     and mincost = infinity; test "totalcost < mincost ?".]

The function cost(i,p) determines the number of instructions
required to change i to p.

3.4.3 SEQUENCING THE TRANSITION

Once a target state has been selected for each input
state, there remains the problem of deciding the order in
which changes are to be made. This chapter first outlines a
general strategy which accomplishes sequencing for all
possible machine structures. The general strategy is called
'blind' sequencing for reasons that become apparent. The
chapter later discusses a second strategy called
'predictive' sequencing which, when possible, might be more
efficient. The following discussion concentrates on 'blind'
sequencing, since blind sequencing always works and is a
good vehicle for outlining the problem involved.


To illustrate the problem, let us consider the
transition from input state (s,s) to target state (R,R).
There are two possible paths, as this graph indicates:

In the graph, each node represents a state, square nodes
represent permitted states, and paths from one state to
another are represented by arcs. The arcs are labeled to
indicate what operation is being performed on which operand.
Thus l1 indicates that the first operand is being loaded
when that arc is followed.


In the graph, there are two paths from (s,s) to
(R,R). In this particular example, the paths are equally
efficient and either can be selected. An important thing to
notice in this example is that every arc connects one
possible input state to another. This is true because we
have taken all possible combinations of locations as
possible input states. Since each arc must lead to another
possible input state, each state can be examined in turn and
one labeled arc can be drawn from its node. Drawing this
one arc for each state completes the graph. A decision
procedure for determining which arc to draw for a given
state is given in Figure 3.3.
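The decision procedure can be sketched in Python as follows. Several simplifications here are assumptions rather than part of the original algorithm: the function operations f1, f2, and f3 of Figure 3.3 are collapsed into a single operation "f"; the mapping from an (input, target) pair to its first operation is inferred from the prose; and a register-register transfer is assumed to exist wherever one is needed. Arc labels combine the operation with the operand number, as in "l1".

```python
# Assumed priorities: store > apply function > transfer > load.
PRIORITY = {"st": 3, "f": 2, "t": 1, "l": 0}

def first_operation(src, dst):
    """First operation needed to move one operand from src toward dst."""
    if src.endswith("f"):
        return "f"      # apply the load function (Rf -> R)
    if src == dst:
        return None     # operand already in place
    if dst == "s":
        return "st"     # register -> storage
    if src == "s":
        return "l"      # storage -> register
    return "t"          # register -> register of another class

def next_arc(state, target):
    """Label of the arc to draw from `state`, e.g. 'l1', or None if done."""
    best = None
    for index, (src, dst) in enumerate(zip(state, target), start=1):
        op = first_operation(src, dst)
        if op is not None and (best is None or PRIORITY[op] > PRIORITY[best[0]]):
            best = (op, index)   # ties keep the lower-numbered operand
    return None if best is None else f"{best[0]}{best[1]}"

def sequence(state, target):
    """Arc labels followed from `state` to `target` if nothing is displaced."""
    state, path = list(state), []
    while (arc := next_arc(state, target)) is not None:
        op, index = arc[:-1], int(arc[-1]) - 1
        if op == "f":
            state[index] = state[index][:-1]   # Rf -> R
        elif op == "st":
            state[index] = "s"
        else:
            state[index] = target[index]       # load or transfer
        path.append(arc)
    return path
```

Applying functions before simple loads means that a freshly loaded operand is not exposed to a load function that might need its register.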


3.4.4 ACCIDENTAL TRANSITIONS 


There is one complication to be considered in
performing the sequencing. It is illustrated below:

Figure 3.2: Sample Sequencing Graph


Algorithm For Blind Sequencing

Problem: Given an input state and a target state, determine
which operand to transform first. (Making this decision for
each possible input state completes the graph.)

Each operand can be expressed as one of the following:

     Input     Target

     s         s
     R         R
     Rf        R'  (R' ≠ R)

The first operation to be performed on a given operand can
be expressed as follows:

     [Table of Input (s, R, Rf) against Target (s, R, R');
     the individual entries are not legible.]

     l  = 'load'
     st = 'store'
     t  = 'transfer'
     fi = 'apply function f'

These operations can be given priorities:

     st > f1 > f3 > f2 > t > l

The effect of the priorities is to give highest precedence
to storing values, next highest to applying functions, then
to register transfers, and finally to simple loads.

Sequencing is done by labelling each operand of each input
state by st, f1, f2, f3, t, or l, and then drawing the arc
corresponding to the operation with the highest priority.
If both have equal priority, then either can be picked.

Figure 3.3

In input state (R,Rf), the first operand is already in a
register. A code generating function is to be applied to
load the second operand. The function can generate code
using registers, and might even call macros as subroutines
to perform runtime computation. For instance, to load a bit
value extending across a word boundary it is necessary on
the 360 to load a register pair and perform double word
shifts to isolate the value. Thus, the function may require
the register containing the first operand to be stored. If
this happens, a transition to (s,R) occurs, rather than to
(R,R). This is called an accidental transition from an
unstable state to an alternate state. To accommodate such
unexpected but unavoidable transitions, the graph is
augmented to include dotted arcs from such unstable states
to the appropriate alternate state.


Figure 3.4: An 'Unstable' State

The graph in Figure 3.4 indicates that applying f2 to the
input state (R,Rf) should lead to (R,R) but might lead to
(s,R). Full examples of such graphs are given in Figures
3.5 and 3.6.


     input states: (s U R U Rf) X (s U R U Rf)
     permitted states: (R,s), (R,R)

Figure 3.5

     input states: (s U R U R' U Rf) X (s U R U R' U Rf)
     permitted states: (R',s), (R',R)     R ∩ R' = ∅

Figure 3.6

Blind Sequencing Graphs

Let us consider for a moment how such accidental
transitions can be implemented in DMACS. In the performance
of the transition (R,Rf)->(R,R), the register containing the
first operand should not be stored unless necessary. This
can be assured by flagging that register's entry in the RST
(register state table). When storing any register, the
GETREG routine must attempt to respect these register
flags. After the function has been completed, the code
generator must check to see if the first operand is still in
a register. If not, it must follow the dotted arc to reach
its next state.
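The check described above amounts to a small driver loop over the sequencing graph. The Python sketch below is illustrative only: the graph fragment, the function names, and the two simulated emitters are hypothetical, and the emitter is assumed to report the state actually reached after each operation.

```python
# Hypothetical fragment of a sequencing graph with permitted state (R,R).
# Each node maps to (arc label, expected next node).
GRAPH = {
    ("R", "Rf"): ("f2", ("R", "R")),
    ("s", "Rf"): ("f2", ("s", "R")),
    ("s", "R"): ("l1", ("R", "R")),
}
PERMITTED = {("R", "R")}

def perform_transition(state, graph, permitted, apply_operation):
    """Follow arcs until a permitted state is reached; return the nodes visited.

    apply_operation(arc, state) stands in for code emission and returns the
    state actually reached, which differs from the expected node when a
    flagged register had to be stored (an accidental transition).
    """
    visited = [state]
    while state not in permitted:
        arc, expected = graph[state]
        actual = apply_operation(arc, state)
        # If actual != expected, this step followed the dotted arc.
        state = actual
        visited.append(state)
    return visited

def displacing_function(arc, state):
    """Simulated emitter: the load function f2 stores operand 1's register."""
    if arc == "f2":
        return ("s", "R")
    return GRAPH[state][1]

def well_behaved(arc, state):
    """Simulated emitter in which no operand is displaced."""
    return GRAPH[state][1]
```

In the displacing case the walk from (R,Rf) passes through (s,R) and reloads the first operand before reaching (R,R); in the well-behaved case it reaches (R,R) directly.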


3.4.5 A REVIEW OF SEQUENCING 


The sequencing strategy described in the previous
sections is called 'blind' sequencing because it involves
applying load functions without looking ahead to see how
these functions behave. Consider the following situation:

Figure 3.7: Sample Sequencing Graph

'Blind' sequencing arbitrarily selects one of these paths.
If it is possible to determine in advance whether a given
function can be applied without disturbing a value already
in a register, then a more effective strategy can be used.
The second strategy is called 'predictive' sequencing.
Predictive sequencing is not so general as blind sequencing
since for an arbitrary machine with register-register
relations like even-odd pairs, the exact needs of an
arbitrary function (even if quite simple) may be very
difficult to predict and control.


This chapter concentrates on the blind sequencing
approach. Predictive sequencing is mentioned primarily to
put the problem in perspective. The main argument against
developing a general predictive sequencing strategy for an
arbitrary machine is that it would be very difficult to
design, and would run the risk of using more instructions at
compile time than were ever saved at runtime. In code
generation, it is generally true that if elaborate
optimization is to be done, it is most profitably done on a
fairly global basis, such as allocating registers over
loops, removing invariance from loops, consolidating common
subexpressions, etc. Blind sequencing has the advantage of
affording a degree of local optimization (compared to a
system which stores all registers before calling a function)
without any really elaborate machinery.

This concludes the introduction to sequencing. Let us
step back for a moment and evaluate briefly what these blind
sequencing graphs imply. A blind sequencing graph is
compiled at machine description time by DMACS. Any
particular graph applies to a particular machine, but the
concept of such a graph is a general formalism and is
applied to all machines.


Figure 3.8: Sample Sequencing Graph

To understand the significance of this fact, let us consider
what factors govern how frequently the accidental transition
is followed from (R,Rf) in Figure 3.8. The frequency
depends both on the source program and on the target
machine. If the source program uses data-types which
require simple accessing functions, then the dotted arc will
tend to be followed less often than if more complex
data-types are used. Similarly, if the machine has many
available registers, the dotted arc will tend to be followed
less often than if the machine has few of them.

Notice, however, that neither the programmer nor the
machine specifier need even know that the problem exists.
For that matter, neither need the language implementer.
Only one person need ever worry about it: the DMACS designer,
who does all the worrying for everyone.


3.4.6 A GENERAL OVERVIEW OF A TRANSITION 


The automatic transition mechanism looks at the type
of a macro's operands, looks at the permitted states, and
then initiates one or more transformations to attain one.
This process can be described in general terms:

1. A system is in a given state (certain values are in
certain locations).

2. It is desired to transform the system to a new state
with certain properties (particular values in particular
locations).

3. Functions are available which can effect a desired
local change to part of the system, but with possibly
unpredictable side-effects. (A load function applied to
an Rif value might store an arbitrary value from a
register.)

4. It is desired to make a sequence of such local
changes and still have the resulting global state
well-defined.


To accomplish this goal, the mechanism that generates
permitted states must be able to detect when one function it
applies stores a register that it expected to be loaded, and
either reload that value or else pick an alternate permitted
state. Two potential problems in a system of this sort are
deadlock and thrashing.

1. Deadlock: Deadlock occurs if a value is irrevocably
locked into a register by the flagging mechanism, so that it
cannot be stored. If this were possible, it is easy to
imagine a situation where a macro called as a subroutine
might be unable to obtain the registers it required. Such a
deadlock cannot occur in the system outlined here, since
the register flags are only interpreted as requests that a
register not be stored unless necessary.

2. Thrashing: In Section 3.4.4, accidental transitions were
described. In such transitions, an attempt to reach one
state results in an inadvertent transition to an alternate
state. One might wonder whether such inadvertent
transitions could be repeated indefinitely. If so, then a
thrashing situation might result, in which each successive
attempt to reach a permitted state is thwarted. Fortunately,

of a load function may result in macros being called as
subroutines, each macro invocation is numbered
sequentially. That number is used to flag registers. In
obtaining a register, the GETREG routine uses the following
priorities:

     1. an empty register

     2. an unflagged register

     3. a register with the lowest flag (i.e., least recently
     set)

Thus the most recently computed values are the most
securely protected. As a result, if a register is
loaded and no arbitrary functions are called, it can be
relied upon to remain in its register.
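The three priorities can be sketched as a short Python function. This is a simplified illustration: the layout of the register state table (a dictionary with "value" and "flag" entries) and the function name getreg are assumptions, and the handling of related registers such as even-odd pairs is omitted.

```python
def getreg(register_class, rst):
    """Choose a register of the class; return (register, must_store).

    rst maps a register name to {"value": ..., "flag": ...}; value None means
    empty and flag None means unflagged. Flags are macro invocation numbers.
    """
    for reg in register_class:
        if rst[reg]["value"] is None:
            return reg, False                      # 1. an empty register
    for reg in register_class:
        if rst[reg]["flag"] is None:
            return reg, True                       # 2. an unflagged register
    victim = min(register_class, key=lambda r: rst[r]["flag"])
    return victim, True                            # 3. lowest (oldest) flag
```

Because a flagged register is still returned when nothing better is available, the flags act only as requests, which is what rules out the deadlock discussed above.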


The process of macro expansion involves performing
the following steps, I, II, and III, in sequence:

I. Protect Values Already in Registers: First, any values to
be used by the macro which are already in the correct
registers are flagged. Such values include operand values
as well as values to be used as indices or bases to obtain
an operand value. If a value in a register requires that a
related register be stored, then make sure that register is
stored and flag it as well.


     [Flowchart not reproduced. Boxes: determine input node;
     apply operation to operand as indicated by arc from
     node; has an accidental transition occurred? (if so,
     follow dotted arc to next node); is node permitted
     node?]

Figure 3.9: Performing a Transition


II. Perform the Transition to a Permitted State: Notice
that it is in the process of following the sequencing arcs
that the load functions and the GETREG routine are called as
subroutines. Load functions are called when a load is
applied to an Rif value. The GETREG routine is called when
a load or transfer arc is traversed.

III. Perform Emission and Bookkeeping:

1. For each operand in storage, load any index or base
values which are not already loaded.

2. Erase all RST flags set by this macro.

3. Emit the code sequence associated with the permitted
node attained using the sequencing graph. The code sequence
was specified by the machine specifier when defining
permitted states.

4. Erase the operands from the MRT and RST.

5. Record the macro result, if any, in the RST and MRT.


3.6 AN EXTENSION: OPERATIONS TO MEMORY 


A useful extension to the state machine concept, as
outlined, is to incorporate 'operation-to-memory'
instructions, such as 'add-to-storage'. It is simple to
include this common class of instructions by allowing the
user to specify alternate destinations for a macro result.

Example: For the PDP-10, which has such instructions, an
OMML definition for IADD (defined in Section 2.1.2) can be:

     IADD a1,a2
          from REG(a1),REG(a2) emit ADD a1,a2 result REG(a1)
          from REG(a1),WORD(a2) emit IADD a1,a2 result REG(a1)
                             or emit IADDM a1,a2 result WORD(a2)

(The IADD and IADDM instructions being emitted are PDP-10
opcodes.) The second state declaration indicates that if a1
is in a register, and a2 is in core, then an IADD
instruction yields a result in the register, and an IADDM
instruction yields a result in core.

To take advantage of such information, only four
modifications need be made to the logic outlined in this
chapter:

1. When emitting code: If such a choice exists and the
core location is a temporary, then defer emitting the
instruction, and flag the register in the RST,
indicating the two instructions and the core location.

2. In the GETREG logic: emit an operation-to-memory
instruction in preference to explicitly storing a
value.

3. In selecting a permitted target state: If there is a
choice of input states due to deferral of such an
instruction, then evaluate both possible input states
and select that one whose target has least cost. If the
selection requires emission of an operation-to-register
instruction, continue to defer its emission until it is
clear that the value need not be stored.

4. After sequencing and prior to code emission: First
emit any necessary op-to-register instructions for input
operands which have been deferred.


Although these simple modifications to the DMACS logic 
certainly lead to no dramatic gains in efficiency, they do 


represent a useful extension to the state machine concept. 


3.7 SAMPLE MACHINE DESCRIPTIONS 


This section outlines the logic of two simple
machine independent macros which might be written in MIML.
Then OMML descriptions of the IBM-360 and of the PDP-10
which fill out the macros are given.


Machine Independent Macro Logic: 


macro MUL X,Y
     if the types of X and Y are integer
     then IMUL X,Y
     else if the types of X and Y are floating
     then FMUL X,Y
     else error

macro SUB X,Y
     if the types of X and Y are integer
     then ISUB X,Y
     else if the types of X and Y are floating
     then FSUB X,Y
     else error
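The macro logic above is a dispatch on operand type and nothing more. A minimal Python sketch follows; the table PRIMITIVES, the function expand, and the dictionary of operand types are hypothetical names chosen for illustration, and MIML itself is not shown.

```python
PRIMITIVES = {
    "MUL": {"integer": "IMUL", "floating": "FMUL"},
    "SUB": {"integer": "ISUB", "floating": "FSUB"},
}

def expand(macro, x, y, type_of):
    """Return the primitive call for a macro, e.g. ('IMUL', 'X', 'Y')."""
    x_type, y_type = type_of[x], type_of[y]
    if x_type != y_type or x_type not in PRIMITIVES[macro]:
        raise TypeError(f"{macro} {x},{y}: operand types not supported")
    return PRIMITIVES[macro][x_type], x, y
```

Nothing in this logic refers to registers or storage; the primitive it selects is filled out separately from the machine description.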


OMML Machine Description of the IBM-360:

The IBM-360 has one set of registers for integer
arithmetic and another set for floating point arithmetic,
and therefore has separate pathways to and from these
registers. For multiplication and division of integer
operands, even-odd pairs of registers are used.

rclass REG:r2,r3,r4,r5,r6,r7,r8,r9,r10,r11
rclass ODDREG:r3,r5,r7,r9,r11
rclass FREG:fr0,fr2,fr4,fr6

relation EPAIR (stored: ODDREG)
     r3:r2,r5:r4,r7:r6,r9:r8,r11:r10

rpath WORD->REG: L REG,WORD
rpath REG->WORD: ST REG,WORD
rpath WORD->FREG: LE FREG,WORD
rpath FREG->WORD: STE FREG,WORD

IMUL m1,m2 (commutative)
     from ODDREG(m1),REG(m2) emit MR EPAIR(m1),m2 result ODDREG(m1)
     from ODDREG(m1),WORD(m2) emit M EPAIR(m1),m2 result ODDREG(m1)

(On the IBM-360, multiplication requires one operand in an
'odd' register. The multiply instruction must refer to its
even pair.)

ISUB s1,s2
     from REG(s1),REG(s2) emit SR s1,s2 result REG(s1)
     from REG(s1),WORD(s2) emit S s1,s2 result REG(s1)

FMUL m1,m2 (commutative)
     from FREG(m1),FREG(m2) emit MER m1,m2 result FREG(m1)
     from FREG(m1),WORD(m2) emit ME m1,m2 result FREG(m1)

FSUB s1,s2
     from FREG(s1),FREG(s2) emit SER s1,s2 result FREG(s1)
     from FREG(s1),WORD(s2) emit SE s1,s2 result FREG(s1)
     from FREG(s2),WORD(s1) emit LNER s2,s2;AE s2,s1 result FREG(s2)

(Notice that since a 'complement register' instruction,
LNER, exists for floating point, a state can be specified
with s2 in a register and s1 in core.)


OMML Machine Description of the PDP-10:

The PDP-10 has one set of registers for both integer
and floating point arithmetic. Since the PDP-10 has
operation-to-memory instructions, all memory-register state
declarations include two alternate destinations.

rclass REG:a,b,c,d,e,f,g,h,i,j,k,l,m,n

rpath REG->WORD: MOVEM REG,WORD
rpath WORD->REG: MOVE REG,WORD

IMUL m1,m2 (commutative)
     from REG(m1),REG(m2) emit IMUL m1,m2 result REG(m1)
     from REG(m1),WORD(m2) emit IMUL m1,m2 result REG(m1)
                        or emit IMULM m1,m2 result WORD(m2)

ISUB s1,s2
     from REG(s1),REG(s2) emit ISUB s1,s2 result REG(s1)
     from REG(s1),WORD(s2) emit ISUB s1,s2 result REG(s1)
                        or emit ISUBM s1,s2 result WORD(s2)

FMUL m1,m2 (commutative)
     from REG(m1),REG(m2) emit FMPR m1,m2 result REG(m1)
     from REG(m1),WORD(m2) emit FMPR m1,m2 result REG(m1)
                        or emit FMPRM m1,m2 result WORD(m2)

FSUB s1,s2
     from REG(s1),REG(s2) emit FSBR s1,s2 result REG(s1)
     from REG(s1),WORD(s2) emit FSBR s1,s2 result REG(s1)
                        or emit FSBRM s1,s2 result WORD(s2)


3.8 SUMMARY: THE STATE MACHINE 


The chapter outlines how a code generator performing
computations can be pictured as a state machine. Then it
shows how the state machine can be formalized and
incorporated into DMACS, a system for building machine
independent code generators.

Once the state machine model is incorporated into
DMACS, it becomes a tool that a language implementer can
use. It is a convenient tool since it frees the language
implementer from worrying about machine structure, from
having to perform tests to determine input states to his
macros, and from having to implement transitions to
permitted states. Thus, the macro logic that the language
implementer specifies need only deal with particular
semantic features of his source language. Therefore the
semantics of the source language are logically divorced from
any one target machine's structure. As a result, these
macros become much simpler to write. Also, once these
machine independent macros are written, they can be
implemented for a variety of machines from a machine
description.

CHAPTER IV

DATA REFERENCE MACROS

4.1 INTRODUCTION

4.1.1 OVERVIEW

Chapter 3 described a state machine model which is
built into DMACS and used as a tool to create machine
independent macros which can be filled out from a machine
description. The state machine is useful to help model
computational macros.

Chapter 4 turns to the problem of achieving the same
machine independence for data reference macros. To achieve
this goal, a data definition facility is built into DMACS.

     [Diagram: a Source Data Declaration and a Target Machine
     Description are supplied to DMACS, which produces the
     Source Data Described For Target Machine.]

A machine specifier describes his machine memory structure
and describes how source data items are mapped into that
memory. From this description DMACS characterizes source
data items in terms of the primitives of the data definition
facility. The language designer writes his data reference
logic in terms of the primitives of the facility using two
built-in functions, called the INCREMENT and CONVERT
functions. These functions operate on the primitives of the
data definition facility. In effect, these two functions
represent a machine independent model of data reference
logic. A language implementer can write data reference
macros in terms of these built-in functions without
worrying about how the data items of his language map into
the core memory of a particular machine.

Chapter 4 is not an extension of Chapter 3. It
pursues a similar goal in a new area: machine independence
for data reference macros similar to that achieved in
Chapter 3 for computational macros.
4.1.2 DATA REFERENCE

This section introduces the reader to the term 'data
reference' as used in this chapter, and gives a simple
example of data reference macros in action. The data
reference constructs dealt with in this chapter are
subscripted structures as found in PL/I. PL/I is chosen
both because it is a well known language and also because it
has powerful data referencing constructs. For simplicity
the chapter deals only with structures whose size is static
and known at compile time. This restriction eliminates some
of the messiness of PL/I's structure implementation and lets
us concentrate on the basic problems of making such
references machine independent. If we allow dynamically
varying structure sizes, then we must worry about what logic
can be performed at compile time and what logic must be
performed at run time by generated code. Restricting our
attention to static structures frees us to concentrate more
fully and more clearly on machine independence of data
reference, rather than on the details of implementing
dynamic structures for PL/I. The restrictions still allow
useful and flexible data referencing constructs.


A sample structure is the following:

declare 1 A (10) fixed,
          2 X,
          2 B (10),
            3 Y (3),
            3 C (10),
            3 Z,
          2 Q (2);


This declaration defines a subscripted structure. The 
A(1)     X
         B(1)     Y(1)
                  Y(2)
                  Y(3)
                  C(1)
                  C(2)
                   .
                   .
                  C(10)
                  Z
          .
          .
         B(10)
         Q(1)
         Q(2)
 .
 .
A(10)    X
          .
          .
         B(10)
         Q(1)
         Q(2)

         Sample Structure Layout
i      SS     A,I
i+1    SUBST  i,B
i+2    SS     i+1,J
i+3    SUBST  i+2,C
i+4    SS     i+3,K


This approach can handle any structure reference in a
simple, general fashion.


Having discussed the data reference macros to be
dealt with, we now present a simple example of how code
might be generated for the macros outlined above. This
simple example assumes that the items all represent full
words of data on some particular machine. Later we shall
extend this simplified situation to allow more complicated
data items.


Each structure item is characterized by two numbers:
     1. an offset from the beginning of its substructure
        element
     2. an element length
The structure item B, for instance, has an offset of 1, and
an element length of 14.


In generating code for the macros above, two running
totals can be kept: compile time words, CW, and runtime
words, RW. Together the running totals represent a
displacement into the structure. At the end of the set of
macros, the displacement points to the correct terminal data
item. The following logic illustrates how the macros might
be expanded. In the interest of clarity and simplicity, the
code generated is not as optimal as it might be.
SS A,I records the offset of A, which is 0, in CW, and
generates code to multiply the element length of A (143)
by I-1. The result of the multiplication becomes RW.
(I-1 is used in the multiplication on the assumption
that the first (zeroth) element is defined as A(1).)


SUBST i,B adds at compile time the offset of B, which is
1, to CW.


SS i+1,J generates code to multiply the element length
of B, 14, by J-1 and add the result to RW.


SUBST i+2,C adds at compile time the offset of C, 3, to
CW.


SS i+3,K generates code to add K-1 to RW. (No
multiplication is necessary since the element length of
C is 1.)
The result of all the computation is a pair of values
(CW, RW) which represent a compile time displacement and a
runtime index pointing to the desired data item. On a
machine like the IBM-360, this pair can be put directly into
a machine instruction (e.g. Load, Add) to access that data
item.
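
The expansion just described can be sketched in Python. This is only an illustration under stated assumptions: the table LAYOUT and the function expand are hypothetical names, not part of DMACS, and the run time total RW is computed directly here, whereas a code generator would emit instructions to compute it. Offsets and element lengths are those given in the text for full word items.

```python
# Hypothetical layout table for the sample structure, in full words:
# item name -> (offset within its parent element, element length)
LAYOUT = {"A": (0, 143), "B": (1, 14), "C": (3, 1)}


def expand(reference):
    """Expand a reference such as A(I).B(J).C(K) into (CW, RW).

    reference is a list of (item name, subscript) pairs. The SUBST step
    adds the item's offset to CW; the SS step adds
    element length * (subscript - 1) to RW.
    """
    cw = 0  # compile time words
    rw = 0  # run time words (computed here rather than by generated code)
    for name, subscript in reference:
        offset, element_length = LAYOUT[name]
        cw += offset
        rw += element_length * (subscript - 1)
    return cw, rw


# A(2).B(3).C(4): CW = 0 + 1 + 3, RW = 143*1 + 14*2 + 1*3
print(expand([("A", 2), ("B", 3), ("C", 4)]))  # prints (4, 174)
```

For A(1).B(1).C(1) the sketch yields (4, 0), which agrees with the sample layout: X occupies word 0, Y(1) to Y(3) occupy words 1 to 3, and C(1) is word 4.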


The above example illustrates the general operation of
data reference macros. It is shown later that an expanded,
but similarly clean, framework can be used to handle data
more complicated than the full word items of this example.
When data items are bytes and bitstrings, the logic of the
macros is somewhat more complicated, and the eventual result
is not a simple full word pointer, but rather a 'location'
that can be input to a load/update routine which accesses
the data item pointed to.


The most important point to notice in the example is
that each structure item is characterized by an offset and
an element length and that on different machines, these
offsets and lengths might be different. A terminal data
item is also characterized by two additional parameters, a
load/update pair to access the item, and a data length which
need not be the same as the element length. (For instance, an
array of 5-bit bitstrings aligned on word boundaries would
have a data length of 5 bits, but an element length of one
word.) These too could vary for different machines.


4.2 THE DATA DEFINITION FACILITY 


4.2.1 DESCRIPTION OF DATA 


The previous section examined a simple example of data
reference. This section presents a more precise framework
for describing the type of data with which the chapter is
concerned. A data item can be characterized by a 4-tuple
(OF,EL,DL,LU) describing how it is implemented on a
particular machine.


OF- the offset of the data item from the origin of the 
structure element to which it belongs 


EL- the element length of the structure element which 
that data item defines 


DL- the data length- the length of the piece of data 
which the data item represents 


LU- the load/update pair which accesses the data item. 


OF and EL can characterize any data item. DL and LU apply
only to terminal data items.


The following example illustrates how data items
declared in a particular source program might be implemented
differently on two different machines, the IBM-360 and the
PDP-10:

declare 1 A packed,
          2 B fixed,
          2 C char (2),
          2 D char;


The PDP-10 is a word addressed machine with 36 bits/word.
Assume a data item of type 'fixed' to be defined as a word
item, and a character to be defined as a nine bit item. The
IBM-360 is a byte addressed machine with 8 bits/byte.
Assume a fixed data item to be defined as a word (four byte)
item, and a character to be defined as a byte. Section
4.2.3 describes how such definition is done. Granting these
assumptions, the storage layout for structure A is as
follows:


PDP-10:

     |<------- word -------->|<------- word -------->|
     |           B           |     C     |  D  |     |

IBM-360:

     |<--- word ---->|
     |       B       |   C   | D |
                             |<->|
                              byte


Thus the data item 'A.D' is described as follows on the two
machines:

     1. on the PDP-10

          OF- 1 word, 18 bits
          EL- 9 bits
          DL- 9 bits
          LU- the load/update routine for bitstrings

     (Notice, as an aside, that if one wanted to pack 5
     seven-bit characters into a word, then instead of an
     element length, this item would have two numbers
     associated with it, 36 and 5. Any index into an array of
     such characters would be multiplied by 36 and divided by
     5 to yield a bit displacement.)

     2. on the IBM-360

          OF- 6 bytes
          EL- 1 byte
          DL- 1 byte
          LU- the load/update routine for bytes.
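
As a rough illustration, the two descriptions of A.D can be written down as records. The sketch below is an assumption made for this example, not a DMACS data structure: the class DataItem, the function offset_in_bits, and the strings naming the load/update routines are hypothetical, and each offset or length is stored as a pair (addressable units, bits).

```python
from typing import NamedTuple, Tuple


class DataItem(NamedTuple):
    """The 4-tuple (OF, EL, DL, LU) for one data item on one machine."""
    of: Tuple[int, int]  # offset as (addressable units, bits)
    el: Tuple[int, int]  # element length
    dl: Tuple[int, int]  # data length
    lu: str              # placeholder name for the load/update pair


# The item A.D from the example, as described in the text.
A_D = {
    "PDP-10": DataItem(of=(1, 18), el=(0, 9), dl=(0, 9), lu="bitstring"),
    "IBM-360": DataItem(of=(6, 0), el=(1, 0), dl=(1, 0), lu="byte"),
}

# Bits per addressable unit: a 36-bit word and an 8-bit byte.
BITS_PER_UNIT = {"PDP-10": 36, "IBM-360": 8}


def offset_in_bits(machine):
    """Total offset of A.D from the start of structure A, in bits."""
    units, bits = A_D[machine].of
    return units * BITS_PER_UNIT[machine] + bits


print(offset_in_bits("PDP-10"), offset_in_bits("IBM-360"))  # prints 54 48
```

The same source item thus sits 54 bits into the structure on one machine and 48 bits on the other, and is reached through a different load/update routine in each case.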




4.2.2 DATA DEFINITION 


To allow data reference macros to operate over data
which can be described differently for different machines,
certain problems must be solved.


1. Suitable primitives must be found, flexible enough to
describe offsets and lengths of data for a number of
machines.


2. An algorithm must be written which takes a structure
declaration and a machine description and computes
offsets and lengths describing that data for that
machine, expressed in terms of these primitives.


3. Data reference macros must be written in terms of
these primitives, so that these macros will be machine
independent.


DMACS solves these three problems by using a built-in
data definition facility. The primitives of the data
definition facility are addressable units and bits. All
data is ultimately described in these terms.


The remainder of the chapter first outlines how these
primitives can be deduced from information supplied by a
machine specifier. The chapter then illustrates how DMACS
can help a language implementer write data reference macros
in terms of these primitives.
4.2.3 DEDUCTION OF PRIMITIVES FROM A MACHINE DESCRIPTION 


DMACS characterizes data for any machine in terms of 
addressable units and bits. Information to make this 
characterization must be deduced from the machine 


description which specifies the following: 


1. Core memory units: The machine specifier defines his core
memory units (such as bits, bytes, words, double words,
etc.), how these map into each other, and which is
addressable.

     A sample declaration for the IBM 360 follows:

          mem BIT
          mem BYTE (8 BIT, addressable)
          mem WORD (4 BYTE, boundary 4)
          mem DWORD (8 BYTE, boundary 8)

     The attribute 'boundary 4' indicates that an
     element with storage class WORD has an address
     congruent to zero, modulo 4.


2. Source data types: The machine specifier must indicate
which storage unit each source data type is to be mapped
into. It is here that a character data item might be defined
as a byte on the IBM-360 and as a bitstring on the PDP-10.
For a language where data can be packed or aligned, both
storage units are indicated. DMACS uses this information
together with the core memory unit information to determine
the offsets and lengths of data items from a source
program.

          map fixed to WORD
          map char to BYTE
          map bit unaligned to BIT
          map bit aligned to BIT align WORD

     The last declaration indicates that when a 'bit'
     data item has been declared to be 'aligned', it
     is to be aligned on a WORD boundary.


3. Load/Update routines: For each of the memory units, the
machine specifier must define a load/update routine to
access source data items mapped by the specifier into that
memory unit.

     Some storage classes may have simple routines:

          mpath WORD->REG: L REG,WORD
          mpath REG->WORD: ST REG,WORD
          mpath BYTE->REG: SR REG,REG; IC REG,BYTE
          (etc.)

     Other load/update routines are more complicated,
     and are discussed in section 4.2.4.
When a language data declaration is processed,
information from such a machine description must be used to
compute offsets and lengths for each data item. The offsets
and lengths can each be described by a 2-tuple (addressable
units, bits). The tuple (4,2), for instance, stands for 4
addressable units and 2 bits.


As a simple example, consider the following
structure:

declare 1 Z,
          2 A fixed,
          2 B bit (12),
          2 C bit (3),
          2 D bit (2) aligned;


The following table indicates how the structure might be
described for the IBM-360 and the PDP-10:

                 IBM-360                 PDP-10
     (data)  (offset) (length)      (offset) (length)
       A      (0,0)    (4,0)         (0,0)    (1,0)
       B      (4,0)    (0,12)        (1,0)    (0,12)
       C      (5,4)    (0,3)         (1,12)   (0,3)
       D      (8,0)    (0,2)         (2,0)    (0,2)

Each tuple represents addressable units and bits. A is a
fixed data item which is mapped into a full word on both
machines (and hence 4 addressable units on the 360), and B,
C, and D are mapped into bits. The flowchart of a general
algorithm which will take a structure and describe it using
these primitives is given in Figure 4.1.




This algorithm illustrates how offsets and element
lengths can be computed for data items. Two stacks are
used: DISP and STACK. The stacks are pushed each time a new
structure level is encountered, and are popped each time a
level ends. Each entry of DISP has two fields, one for
addressable units and one for bits, which record
displacement from the beginning of the current structure
level. STACK is used to store the name of the current data
item at each level. For each data item an offset (OF) and an
element length (EL) is computed.
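
A much reduced version of this computation can be sketched in Python. It is a simplification made for illustration and is not the flowchart of Figure 4.1: it handles a single structure level with no subscripts, so one running displacement stands in for the DISP stack and STACK is not needed. The names MACHINES, align_to_word and layout, and the encoding of the machine descriptions, are assumptions of this sketch.

```python
# Hypothetical encoding of the two sample machine descriptions:
# bits per addressable unit, and addressable units per WORD.
MACHINES = {
    "IBM-360": {"bits_per_unit": 8, "units_per_word": 4},
    "PDP-10": {"bits_per_unit": 36, "units_per_word": 1},
}


def align_to_word(units, bits, machine):
    """Round a displacement up to the next WORD boundary."""
    if bits:
        units += 1  # skip the rest of a partly used addressable unit
    w = machine["units_per_word"]
    return ((units + w - 1) // w) * w, 0


def layout(items, machine):
    """Return {name: (offset, length)} for one structure level.

    items is a list of (name, kind, bit_count, aligned), where kind is
    'fixed' (one WORD) or 'bit' (bit_count bits).
    """
    bpu = machine["bits_per_unit"]
    units, bits = 0, 0  # running displacement within this level
    table = {}
    for name, kind, bit_count, aligned in items:
        if kind == "fixed" or aligned:
            units, bits = align_to_word(units, bits, machine)
        if kind == "fixed":
            length = (machine["units_per_word"], 0)
        else:
            length = (0, bit_count)
        table[name] = ((units, bits), length)
        # Advance the displacement, carrying whole units out of the bits.
        total_bits = bits + length[1]
        units += length[0] + total_bits // bpu
        bits = total_bits % bpu
    return table


# The structure Z from the example above.
Z = [("A", "fixed", None, False), ("B", "bit", 12, False),
     ("C", "bit", 3, False), ("D", "bit", 2, True)]

for machine_name in ("IBM-360", "PDP-10"):
    print(machine_name, layout(Z, MACHINES[machine_name]))
```

For structure Z the sketch reproduces the offsets and lengths in the table above for both machines.

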
Figure 4.1: Flowchart of the algorithm for computing
offsets and element lengths


4.2.4 COMPLEX LOAD/UPDATE ROUTINES


The previous section gave examples of simple
load/update routines for addressable data items.
Load/update routines for non-addressable items (i.e. bit
strings) are more complicated for several reasons.

     1. They take as input an address, a bit displacement,
        and a bit length.
     2. Bit displacements can be runtime or compile time
        values.
     3. Bitstrings can run across word boundaries.


Two possible solutions are available to handle this
problem. The easiest solution is to require that the
machine specifier provide a subroutine which makes the
appropriate checks, and executes the correct load and shift
instructions for the different situations. The second
solution is to allow the machine specifier to define open
code sequences to be generated, at least for the simpler
cases (for instance, when bit displacement is known at
compile time, and hence it can be determined that the item
does not cross a word boundary).


A sample load routine for the IBM 360 might appear
somewhat as follows:

     mpath BIT->REG: L   REG,WORD
                     SLL REG,DISP
                     SRL REG,32-LEN
The whole problem of exactly how to allow a machine
specifier to define open sequences of this sort is a
difficult one. It is to a large degree an implementation
problem for a DMACS builder, rather than a conceptual
problem of machine independence. It is therefore left
somewhat open in this chapter.
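
The effect of the sample load routine for bits can be checked with a small Python simulation. The sketch assumes a 32-bit word, a bit displacement counted from the high-order end of the word, and a bitstring that lies wholly within one word; the function name load_bits is hypothetical.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1


def load_bits(word, disp, length):
    """Extract `length` bits starting `disp` bits from the left of `word`."""
    if length < 1 or disp < 0 or disp + length > WORD_BITS:
        raise ValueError("bitstring must lie within one word")
    reg = word & MASK                   # L   REG,WORD
    reg = (reg << disp) & MASK          # SLL REG,DISP   (drop bits to the left)
    reg = reg >> (WORD_BITS - length)   # SRL REG,32-LEN (right-justify)
    return reg


print(hex(load_bits(0x12345678, 4, 8)))  # prints 0x23
```

The check at the top of the function corresponds to the simpler case named above, in which the displacement is known and the item does not cross a word boundary; a bitstring that does cross one would need a longer sequence or a subroutine.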


4.3 MACHINE INDEPENDENT MACROS 


4.3.1 DATA MACRO LOGIC 


The previous sections have illustrated how data items
on any machine can be characterized in terms of the
primitives of a data definition facility. This section
describes how machine independent data macros can be written
in terms of the primitives of the facility in a clean,
simple fashion.


As outlined in section 4.1.2, the operation of a data
macro consists of incrementing a pointer into a data base, a
pointer consisting of both runtime and compile time values.
In the machine independent macros which a language
implementer writes, all offsets and lengths are expressed in
addressable units and bits. Thus the pointer being
incremented can be seen as a 4-tuple: (CA,RA,CB,RB),

     CA- compile time addressable units
     RA- run time addressable units
     CB- compile time bits
     RB- run time bits

Any element may be nil: i.e. if CA is nil, the pointer has
not been incremented by any compile time addressable units.
The process of incrementing the pointer can be expressed by
the following graph, called the INCREMENT function:


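     (nil) --ca/nil--> (CA) --ca/Add-c(CA,ca)--> (CA)
     (nil) --ra/nil--> (RA) --ra/Add-r(RA,ra)--> (RA)
     (nil) --cb/nil--> (CB) --cb/Add-c(CB,cb)--> (CB)
     (nil) --rb/nil--> (RB) --rb/Add-r(RB,rb)--> (RB)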


This graph contains four pairs of states. Each pair is a
state machine recording the presence or absence of one
element of the 4-tuple. Hence, if the pointer has no
runtime bits, the pair for RB is in the state 'nil'. Each
pair starts in the state 'nil'. Input is represented by
'ca', 'cb', 'ra', and 'rb'. Actions to be taken are either
Add-c, representing addition at compile time, or Add-r,
representing addition at run time, or nil. When the pointer
is incremented by a new value, the appropriate state machine
makes a transition. If this is the first transition for
that machine, then there is a change of state with no action
performed. If this is not the first transition, then either
a compile time 'Add-c' is performed, or code is generated
to perform a run time 'Add-r'.


The graph is a machine independent model of the
operation of a data reference macro. It is machine
independent because it operates on the primitives of a data
definition facility in which data for a variety of machines
can be automatically expressed.
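
The four state machines can be sketched as a small Python class. This is an illustrative reading of the graph and not the DMACS implementation: the class name Pointer, the use of None for 'nil', the representation of run time values as register names, and the ADD instruction text are all assumptions of the sketch.

```python
class Pointer:
    """The 4-tuple (CA, RA, CB, RB); None stands for 'nil'."""

    COMPILE_TIME = ("CA", "CB")
    RUN_TIME = ("RA", "RB")

    def __init__(self):
        self.elements = {"CA": None, "RA": None, "CB": None, "RB": None}
        self.code = []  # illustrative run time instructions

    def increment(self, element, value):
        """Feed one input (ca, ra, cb or rb) to its state machine."""
        if element not in self.elements:
            raise ValueError(f"unknown element {element!r}")
        current = self.elements[element]
        if current is None:
            # First transition: leave 'nil', record the value, no action.
            self.elements[element] = value
        elif element in self.COMPILE_TIME:
            # Add-c: the addition is done by the compiler itself.
            self.elements[element] = current + value
        else:
            # Add-r: emit code; the sum stays in the first operand.
            self.code.append(f"ADD {current},{value}")


p = Pointer()
p.increment("CA", 1)     # first ca: state change only
p.increment("CA", 3)     # Add-c at compile time
p.increment("RA", "R1")  # first ra: state change only
p.increment("RA", "R2")  # Add-r: code is generated
print(p.elements, p.code)
```

After these four calls CA holds 4, RA names register R1, and the only generated instruction is the run time addition of R2 into R1.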


Since the INCREMENT function is machine independent
it is built into DMACS. Using this function, the language
designer can write his macros without worrying about how
different data items map into core. In a similar manner, a
CONVERT function to convert the pointer into a data item
'location' (to be input to a load/update routine) is built
into DMACS. The logic for this routine is discussed in the
next section.


Using these two built-in routines, a language
designer can write a subscript macro with the following
logic:

     SUBSCRIPT X,I
     1. subtract 1 from I yielding Value(I-1).
     2. if the element length of X is (1,0) or (0,1)
          then INCREMENT X by Value(I-1)
          else multiply Value(I-1) by the element length of X
               and INCREMENT X by the result
     3. if X is a terminal data item
          then apply CONVERT to the pointer computed above.

(When incrementing X with a value, the units of the element
length of X determine which component of the pointer is
incremented. If I is a compile time value, the subtraction
and multiplication can be done at compile time. Otherwise
code must be generated to perform these operations at run
time.)

The macro is completely free of machine dependent
detail. Using the two functions built into DMACS, the
language implementer is able to write a macro dealing only
with the semantics of his source language. For instance,
such a macro might include logic to handle subscript bounds
or to handle special types of subscripting such as for
triangular matrices, but need pay no attention to machine
structure at all.
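
Steps 1 and 2 of the subscript macro can be sketched as follows. The sketch rests on several assumptions that are not in the text: the pointer is a plain dictionary with None for nil, an element length has exactly one nonzero part, a run time subscript is given as a register name, and the SUB, MUL and ADD instruction strings are placeholders. Step 3 (CONVERT) is left out.

```python
def subscript(pointer, element_length, index, code):
    """Apply SUBSCRIPT X,I to pointer, a dict with keys CA, RA, CB, RB.

    element_length is (addressable units, bits) with one nonzero part;
    index is an int (compile time) or a register name (run time).
    """
    units, bits = element_length
    scale = units if units else bits
    kind = "A" if units else "B"  # which half of the pointer to increment
    if isinstance(index, int):
        # Compile time subscript: subtract and multiply in the compiler.
        value = (index - 1) * scale
        key = "C" + kind
        pointer[key] = value if pointer[key] is None else pointer[key] + value
    else:
        # Run time subscript: generate code for the same operations.
        code.append(f"SUB {index},1")
        if scale != 1:
            code.append(f"MUL {index},{scale}")
        key = "R" + kind
        if pointer[key] is None:
            pointer[key] = index
        else:
            code.append(f"ADD {pointer[key]},{index}")
    return pointer


p = {"CA": None, "RA": None, "CB": None, "RB": None}
code = []
subscript(p, (0, 9), 3, code)     # third 9-bit element: CB becomes 18
subscript(p, (4, 0), "R1", code)  # run time index over 4-unit elements
print(p, code)
```

Nothing in the function refers to a particular machine; the machine enters only through the element lengths passed to it.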
4.3.2 THE CONVERT FUNCTION 


The CONVERT function takes a pointer in the form of
a 4-tuple (CA,RA,CB,RB) as discussed in the previous
section, and converts it into a form suitable for use by a
load/update routine. When the pointer is expressed in
addressable units and references a simply accessible item,
conversion is not necessary. When the pointer includes
bits, however, the bit elements of the pointer must be
normalized to yield a number of addressable units, and a
local bit displacement within the memory unit pointed to by
the addressable elements RA and CA.


First we discuss the problem of normalization, and
then how it fits into the CONVERT logic as a whole.


NORMALIZATION: Consider the problem of accessing a
bitstring on the PDP-10 and on the IBM-360 given the base
address of a data area and a bit index into it. On the
PDP-10, the index should be divided by 36 (bits/word),
yielding a full-word index as the quotient, and the bit
displacement as remainder. On the 360, assuming the
load/update routine uses full word load instructions, the
address of a full-word boundary is wanted, together with a
bit displacement to within that word. Therefore, the index
should be divided by 32 (bits/word), yielding a bit
displacement as remainder. Multiplying the quotient by 4
would then yield an index in addressable units. Thus a data
type may have the following attributes when implemented on a
particular machine: Nd, a number to divide a bit pointer by,
to yield a 'local' bit pointer as a remainder; Na, a number
to multiply the quotient of that division by to yield
addressable units.


When the bit pointer is a compile time value, this
normalization is performed at compile time. Otherwise, code
must be generated to perform the normalization at run time.
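
For a compile time bit pointer the normalization is a single division. The sketch below uses the values of Nd and Na given in the text for the two machines; the table name NORMALIZE and the function name normalize are assumptions of the sketch.

```python
# (Nd, Na) per machine: divide the bit index by Nd, then multiply the
# quotient by Na to obtain addressable units.
NORMALIZE = {"PDP-10": (36, 1), "IBM-360": (32, 4)}


def normalize(bit_index, machine):
    """Return (addressable units, local bit displacement)."""
    nd, na = NORMALIZE[machine]
    quotient, local_bits = divmod(bit_index, nd)
    return quotient * na, local_bits


print(normalize(100, "PDP-10"))   # prints (2, 28)
print(normalize(100, "IBM-360"))  # prints (12, 4)
```

A bit index of 100 thus becomes word 2, bit 28 on the PDP-10, and byte address 12 (the fourth full word), bit 4 on the 360.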


CONVERT: The convert function takes a pointer
P=(CA,RA,CB,RB) and converts it to a location L=(CA,RA,cbl)
or L=(CA,RA,rbl), where cbl and rbl are local bit
displacements into a memory unit. The logic of this
function is:

     1. If CB≠nil and RB=nil, then normalize CB at compile
        time yielding ca and cbl. Then INCREMENT P by ca.

     2. If RB≠nil then do
        ( a. if CB≠nil then
             (generate code to add RB and CB yielding RB)
          b. generate code to normalize RB yielding ra and
             rbl
          c. INCREMENT P with ra )


This function yields an expression which can be input
to a load/update pair. This function operates on the
primitives of a data-definition facility and is therefore
machine-independent.
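
The two cases of the CONVERT logic can be sketched in Python. As with the other sketches, this is an assumption-laden illustration rather than the DMACS routine: the pointer is a dictionary with None for nil, run time values are register names, the parameter temp names a scratch register, and the generated instructions are written as descriptive placeholder strings.

```python
def convert(pointer, nd, na, code, temp="T"):
    """Convert pointer (keys CA, RA, CB, RB; None is nil) into a
    location (CA, RA, local bit displacement)."""
    ca, ra = pointer["CA"], pointer["RA"]
    cb, rb = pointer["CB"], pointer["RB"]
    if rb is None:
        if cb is None:
            return ca, ra, None  # no bits: nothing to convert
        # Step 1: normalize CB at compile time, then INCREMENT by ca.
        quotient, cbl = divmod(cb, nd)
        units = quotient * na
        ca = units if ca is None else ca + units
        return ca, ra, cbl
    # Step 2: run time bits are present.
    if cb is not None:
        code.append(f"ADD {cb} TO {rb}")                      # step 2a
    code.append(f"DIVIDE {rb} BY {nd} GIVING {temp} REMAINDER {rb}")
    if na != 1:
        code.append(f"MULTIPLY {temp} BY {na}")               # step 2b
    if ra is None:                                            # step 2c
        ra = temp
    else:
        code.append(f"ADD {temp} TO {ra}")
    return ca, ra, rb


# Compile time case on the 360 (Nd=32, Na=4): 70 bits past byte 4.
print(convert({"CA": 4, "RA": None, "CB": 70, "RB": None}, 32, 4, []))
# prints (12, None, 6)
```

In the run time case the function returns register names in place of numbers and appends the normalizing instructions to the list passed as code.

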
4.4 SUMMARY


The chapter describes how a data definition facility
is built into DMACS to facilitate the writing of machine
independent macros. Then it discusses how this facility is
used: how the machine specifier describes his machine
memory, accessing functions, and the mapping of source data
types into core; and how DMACS then uses the information to
compute the primitives which describe a source program's
data. The chapter then discusses how machine independent
macros are written in terms of two machine independent
functions (INCREMENT and CONVERT), operating over these
primitives. These two functions embody the substance of the
machine related part of data reference macros. They are
built into DMACS to be used as a tool by the language
implementer.


The basic concept set forth in this chapter is the
use of a data definitional facility. The rest of the
chapter is built around this idea. It is instructive to ask
how much more flexibility the definitional facility affords
over a code generator for a single target machine. At first
glance, it might appear that the definitional facility
merely lets DMACS describe data with different numbers on
different machines, but perform the same manipulations with
those numbers in all cases. This is not true. The
definitional facility gives the language implementer the
ability to handle a given source data reference with
different sections of his logic on different machines. Thus
an array of characters can be handled for the IBM-360 as an
array of addressable units with element length of 1, whereas
on the PDP-10, it would be handled as an array of bitstrings
of length 9. The operations performed on these element
lengths could be different, and the load/update routines
used to access the items could be different.


Thus the definitional facility of DMACS provides a
flexible interface between machine structure and macro
logic. At the same time, it is an interface that is almost
invisible to both the machine specifier and the language
implementer. The language implementer is able to think
primarily in terms of the semantics of his language
irrespective of machine structure, and the machine
specifier merely gives a description of his machine. DMACS
takes care of binding the macros and the machine description
together.


CHAPTER V 


CONCLUSIONS AND FURTHER WORK


5.1 AREAS FOR FURTHER WORK 
5.1.1 FURTHER ASPECTS OF CODE GENERATION
The scope of the present research is limited since it
does not address the task of making an entire compiler
machine independent. Only two classes of macros are
studied, and only a limited set of possible operand types
are allowed. Also, many machine idiosyncrasies, such as
interrupt handling, are ignored.


The problem of making a powerful compiler machine
independent is a difficult and a messy one. The problem is
somewhat softened by the fact that many machine
idiosyncrasies can properly be handled by subroutines, and
thus may not prove to be insurmountable stumbling blocks.


One significant area not dealt with is the class of
control macros, such as subroutine calls, entry and return
macros, etc. These macros may not, however, require any
elaborate mechanisms to allow machine independence. In
general, such control macros are implemented very similarly
on different machines and may be describable merely by
appropriate code sequences. One minor problem is to assure
that the stack is allocated in the correct units.


Types of operands not considered include character
strings and decimal operands such as those found on the
IBM-360. Both of these types of operands are generally not
manipulated via registers, but rather by subroutine or by
special memory-memory instructions. The model of
computation in Chapter 3 is oriented primarily towards
manipulating values using registers. More work is also
needed to determine exactly how load/update routines can
best be defined to fit into a machine independent
framework.


5.1.2 EXTENDING THE MODELS 


The models presented in this paper are set forth
primarily to isolate some basic ideas involved in code
generation, and to provide a basis for more general
extensions which could include a broader spectrum of machine
structure.


In particular, one might relax some of the
constraints imposed on register structure in Chapter 3
(perhaps to include such machines as a stack machine), and
develop an automatic mechanism for attaining permitted
states in this less constrained system. By relaxing
constraints in this fashion, it might be possible to obtain
a number of different automatic mechanisms, together with
classes of machine structures which can be handled by each
mechanism.


In a similar vein, one might consider different
possible addressing structures, and determine how the
machine independent data reference logic can be modified to
accommodate them. In particular, it might be useful to look
at addressing on small machines, such as the PDP-8, which
tend to have anomalous addressing strategies due to bit
conserving design considerations. In fact, such machines
might be practical candidates for a descriptive system like
DMACS, since they tend to be reasonably similar, and since
they tend to be unsuitable for sustaining compilers
themselves.
5.2 SUMMARY OF RESULTS 


The present research has examined the two most common
types of macro used for handling arithmetic values:
computation macros, and data reference macros. For each of
the two types of macro, the paper develops a machine
independent formalism which models the machine dependent
aspects of the macro's logic: a state machine for
computation macros; and the INCREMENT and CONVERT functions
for data reference macros.


Chapters 3 and 4 show how the models can be
incorporated into DMACS, a descriptive macro system. A
language implementer can use the models as tools, writing
his macros in terms of machine independent primitives which
invoke the model. A machine specifier can then describe his
machine, and descriptively fill out the primitives as they
apply to his machine.
Thus the research has several purposes:


1. The research is a first attempt to formalize some of the
logic involved in generating code for high level languages.


2. The research is an attempt to see what is involved in
attaining machine independence in a code generator, similar
to the language independence and the token independence
achieved by automatic parsing and automatic lexical
systems.


3. Towards this end, this paper explores the question of
just what might reasonably constitute a 'description' of a
machine.


4. The research helps make clearer the distinction between
the semantics of a high level language and the structure of
a target machine, a distinction that is often unclear in a
compiler oriented towards a single machine.